Estimation Methods for the Size of Deep Web Textural Data Source: A Survey

Author

  • Jie Liang
Abstract

The estimation of the size of deep web data sources has been an open problem since 1998. This survey reviews all papers and other resources that were available online on estimating the size of data sources during the period 1998 to 2008. We first clarify several basic terms that are used in the survey but whose meanings vary in the literature. Basic models used in the estimation literature are also discussed. The survey introduces query-based sampling approaches and reviews methods for estimating the relative size and the actual size of data source(s). Query-based sampling is biased. The survey also reviews research on overcoming the biases caused by various estimation methods. Finally, future directions for size estimation are discussed.

Related resources

DASTWAR: a tool for completeness estimation in magnitude-size plane

Today, great observatories around the world devote a substantial amount of observing time to sky surveys. The resulting images are inputs to source-finder modules. These modules search for the target objects and provide us with source catalogues. We sought to quantify the ability of detection tools to recover faint galaxies regularly encountered in deep surveys. Our approach was based on com...

Improving Estimation of the Soil-Water Characteristic Curve Using the Particle-Size Distribution Curve and Soil Bulk Density

  Soil particle size distribution and bulk density are used for estimating soil-moisture characteristic curve. In this model, soil particle size distribution curve is divided into a number of segments, each with a specific particle radius and cumulative percentage of the particles greater than that radius. Using these data, soil-moisture characteristic curve is estimated. In the model a scale f...

SPOT-5 Spectral and Textural Data Fusion for Forest Mean Age and Height Estimation

Precise estimation of the forest structural parameters supports decision makers for sustainable management of the forests. Moreover, timber volume estimation and consequently the economic value of a forest can be derived based on the structural parameter quantization. Mean age and height of the trees are two important parameters for estimating the productivity of the plantations. This research ...

Ranking bias in deep web size estimation using capture recapture method

Many deep web data sources are ranked data sources, i.e., they rank the matched documents and return at most the top k results even though more than k documents match the query. When estimating the size of such a ranked deep web data source, it is well known that there is a ranking bias: the traditional methods tend to underestimate the size when queries overflow (match m...

Publication date: 2008